The objective of this report is to use our open data portal to identify which companies in the for-hire vehicle (FHV) sector generate the highest trip volume over time. By understanding the “top players” in the industry, we can establish our reasoning for publishing selected indicators that reference these companies. We use the FHV Base Aggregate report on Open Data; below is a sample of the first 100 rows of that data:
As we can see above, the data is reported weekly as trips per base per company and is grouped by “wave_number”. We can unpack this a bit to understand what each variable means. The “base_license_number” identifies a base associated with a company, which is given in the “company” field (a company may have more than one base; UBER, for instance, has over 20). The “wave_number” is associated with a base’s size, which in prior years was relevant for separating bases; we will not look at “wave_number” here as it is not pertinent to this analysis.
As with any industry it is always important to understand the overall numbers over time to see what activity looks like. Below we aggregate trips on a week and year level to see overall volume from the for-hire sector:
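As a minimal sketch of this aggregation, assuming the report has been loaded as a list of row dictionaries: the field names "year", "week", and "trips" are our own placeholders (only “base_license_number” and “company” are named above), and the sample values are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample rows. "year", "week", and "trips" are assumed field
# names, and the numbers are illustrative, not taken from the real report.
SAMPLE_ROWS = [
    {"base_license_number": "B00001", "company": "UBER", "year": 2015, "week": 1, "trips": 120},
    {"base_license_number": "B00002", "company": "UBER", "year": 2015, "week": 1, "trips": 80},
    {"base_license_number": "B00003", "company": "LYFT", "year": 2015, "week": 1, "trips": 50},
    {"base_license_number": "B00001", "company": "UBER", "year": 2015, "week": 2, "trips": 150},
    {"base_license_number": "B00003", "company": "LYFT", "year": 2016, "week": 1, "trips": 90},
]


def weekly_totals(rows):
    """Sum trips across all bases for each (year, week) pair."""
    totals = defaultdict(int)
    for row in rows:
        totals[(row["year"], row["week"])] += row["trips"]
    # Sort chronologically so the result can be plotted as a time series.
    return dict(sorted(totals.items()))


if __name__ == "__main__":
    for (year, week), trips in weekly_totals(SAMPLE_ROWS).items():
        print(year, week, trips)
```

Keying on the (year, week) pair rather than on the week alone keeps the same week number in different years from being merged.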
Weekly trends above show a massive increase in for-hire vehicle activity, no doubt a result of the rise in apps touting ridesharing options. Note the large drops from year to year; this pattern repeats throughout this analysis and stems from the way the original data is aggregated. For now we can set that issue aside and look at the overall trend.
How does this increase compare against yellow taxis (MED) and the infamous green taxis (SHL)? There is no surefire way to aggregate at the month level using the weekly report, as weeks overlap month boundaries, but we can still produce a basic comparison using the weekly numbers and the TLC indicators provided at http://www.nyc.gov/html/tlc/html/technology/aggregated_data.shtml. Below we show the comparison, totaling FHV numbers to show average trips per day each month:
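One possible way to approximate a monthly figure from weekly totals is sketched below. It is not necessarily the method used for the comparison in this report. It assumes each weekly record covers seven consecutive days from a known start date, that weeks do not overlap each other, and that trips are spread evenly across the days of a week. The function name and the input shape are our own.

```python
from collections import defaultdict
from datetime import date, timedelta


def avg_daily_trips_by_month(weeks):
    """Estimate average trips per day for each (year, month).

    `weeks` is an iterable of (week_start_date, total_trips) pairs.
    Each week's total is split evenly over its 7 days (an assumption),
    and every day is then credited to the month it falls in.
    """
    trips = defaultdict(float)
    days = defaultdict(int)
    for start, total in weeks:
        per_day = total / 7
        for offset in range(7):
            day = start + timedelta(days=offset)
            key = (day.year, day.month)
            trips[key] += per_day
            days[key] += 1
    # Divide by the number of days actually covered by the data in each month.
    return {key: trips[key] / days[key] for key in sorted(trips)}


if __name__ == "__main__":
    # Illustrative numbers only: one week straddling January and February.
    sample = [(date(2015, 1, 26), 700), (date(2015, 2, 2), 1400)]
    print(avg_daily_trips_by_month(sample))
```

The even split is the weak point of this approach: a week that straddles two months contributes the same daily rate to both, so sharp changes at a month boundary are smoothed out.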
Amazingly we can see that FHV numbers have surpassed medallions already and continue to rise. SHLs continue to fall, a victim of FHV success.
Now that we have looked at the overall industry we can delve into the FHV numbers; below we aggregate on the weekly and company level for each year to see how many trips each company outputs:
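A minimal sketch of the company-level aggregation follows, again with assumed field names ("year", "week", "trips") and made-up sample values. "COMPANY_X" is a hypothetical placeholder, not a real base. Ranking companies within each year is our addition, to make the “top players” explicit.

```python
from collections import defaultdict

# Hypothetical sample rows; field names and values are illustrative assumptions.
SAMPLE_ROWS = [
    {"company": "UBER", "year": 2015, "week": 1, "trips": 200},
    {"company": "UBER", "year": 2015, "week": 2, "trips": 250},
    {"company": "LYFT", "year": 2015, "week": 1, "trips": 60},
    {"company": "COMPANY_X", "year": 2015, "week": 1, "trips": 90},
    {"company": "LYFT", "year": 2016, "week": 1, "trips": 300},
    {"company": "UBER", "year": 2016, "week": 1, "trips": 500},
]


def top_companies(rows, n=2):
    """Return the n companies with the most trips in each year."""
    totals = defaultdict(lambda: defaultdict(int))
    for row in rows:
        totals[row["year"]][row["company"]] += row["trips"]
    ranked = {}
    for year in sorted(totals):
        # Sort by trips descending, then by name so ties are deterministic.
        ordered = sorted(totals[year].items(), key=lambda kv: (-kv[1], kv[0]))
        ranked[year] = ordered[:n]
    return ranked


if __name__ == "__main__":
    for year, companies in top_companies(SAMPLE_ROWS).items():
        print(year, companies)
```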
The chart above shows the massive growth of companies like UBER and LYFT, which now dominate the industry. The most startling takeaway from this graph is the vast difference in volume between UBER and everyone else. And yet we don’t know how many vehicles are making these trips; is UBER leveraging an increasing supply? We can look at unique dispatched vehicles to see how many vehicles are making trips for each company:
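To check whether vehicle counts track trips, one option is to compare trips per vehicle across companies, sketched below. The field names "trips" and "vehicles" are assumptions, and the values are made up. Summing per-base unique vehicle counts may also overcount any vehicle dispatched by more than one base, so the ratio should be read as a rough indicator only.

```python
from collections import defaultdict

# Hypothetical rows for a single week; field names and values are assumed.
SAMPLE_ROWS = [
    {"base_license_number": "B00001", "company": "UBER", "trips": 1000, "vehicles": 100},
    {"base_license_number": "B00002", "company": "UBER", "trips": 500, "vehicles": 50},
    {"base_license_number": "B00003", "company": "LYFT", "trips": 300, "vehicles": 60},
]


def trips_per_vehicle(rows):
    """Return trips divided by summed unique-vehicle counts for each company."""
    trips = defaultdict(int)
    vehicles = defaultdict(int)
    for row in rows:
        trips[row["company"]] += row["trips"]
        # Caveat: a vehicle working for two bases is counted twice here.
        vehicles[row["company"]] += row["vehicles"]
    # Skip companies reporting zero vehicles to avoid dividing by zero.
    return {
        company: trips[company] / vehicles[company]
        for company in sorted(trips)
        if vehicles[company] > 0
    }


if __name__ == "__main__":
    print(trips_per_vehicle(SAMPLE_ROWS))
```

A roughly constant ratio over time would suggest that trip growth comes from more vehicles rather than from each vehicle making more trips.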
It seems that vehicle counts are mirroring trips, at least as far as this report tells us.
The above report uses public data sources to justify the dissemination of selected company statistics. Should we decide to proceed with publishing certain industry indicators beyond MED and SHL, we should use the above report to demonstrate our logic and show transparency. An argument could also be made to omit all non-app companies, as the app companies generally command the highest volume in the industry.